[Spec Decode] Support EAGLE for Qwen3 MoE #26241

seven-mile · 2025-10-05T04:48:08Z

Purpose

Test Plan

vllm serve \
    Qwen/Qwen3-30B-A3B \
    --host 0.0.0.0 \
    --port 7000 \
    --seed 42 \
    -dp 2 \
    --enable-expert-parallel \
    --enforce-eager \
    --max-model-len 4096 \
    --gpu_memory_utilization 0.8 \
    --speculative-config '{"model":"Tengyunw/qwen3_30b_moe_eagle3","num_speculative_tokens":4}'

lm_eval --model local-completions \
  --tasks gsm8k \
  --model_args model=Qwen/Qwen3-30B-A3B,base_url=http://0.0.0.0:7000/v1/completions,num_concurrent=1,max_retries=3,tokenized_requests=False
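Before launching the full gsm8k run, a single request can confirm that the server is reachable. The following is a minimal sketch, not part of this PR: it assumes the server from the `vllm serve` command above is listening on port 7000 and exposes the OpenAI-compatible `/v1/completions` endpoint that the `lm_eval` command targets. The helper names `build_completion_request` and `complete` are hypothetical.

```python
import json
import urllib.request

# Endpoint and model name are taken from the commands in the test plan.
BASE_URL = "http://0.0.0.0:7000/v1/completions"
MODEL = "Qwen/Qwen3-30B-A3B"


def build_completion_request(prompt, max_tokens=64):
    """Build (but do not send) a POST request for the completions endpoint."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # greedy decoding, so repeated runs are easier to compare
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, max_tokens=64):
    """Send the request and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_completion_request(prompt, max_tokens)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


# Usage, once the server is up:
#   print(complete("Question: What is 12 * 7?\nAnswer:"))
```

Sending the same prompt with and without `--speculative-config` gives a quick sanity check before the longer benchmark.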

Test Result

w/ spec

|Tasks|Version|Filter|n-shot|Metric| |Value| |Stderr|
|-----|------:|------|-----:|------|---|-----:|---|-----:|
|gsm8k|3|flexible-extract|5|exact_match|↑|0.8484|±|0.0099|
| | |strict-match|5|exact_match|↑|0.8984|±|0.0083|

w/o spec

|Tasks|Version|Filter|n-shot|Metric| |Value| |Stderr|
|-----|------:|------|-----:|------|---|-----:|---|-----:|
|gsm8k|3|flexible-extract|5|exact_match|↑|0.8506|±|0.0098|
| | |strict-match|5|exact_match|↑|0.8984|±|0.0083|
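To read the two result tables side by side, the gap between runs can be expressed in units of the reported standard error. This is a rough sketch, not part of the PR: it treats the two runs as independent samples, which is an assumption, since both runs score the same gsm8k questions. The helper name `z_score` is hypothetical.

```python
import math


def z_score(value_a, stderr_a, value_b, stderr_b):
    """Difference between two scores in units of their combined standard error.

    Assumes the two measurements are independent (an approximation here).
    """
    combined = math.sqrt(stderr_a ** 2 + stderr_b ** 2)
    return (value_a - value_b) / combined


# (value, stderr) pairs copied from the tables: (with spec, without spec).
results = {
    "flexible-extract": ((0.8484, 0.0099), (0.8506, 0.0098)),
    "strict-match": ((0.8984, 0.0083), (0.8984, 0.0083)),
}

for name, ((v_spec, se_spec), (v_base, se_base)) in results.items():
    z = z_score(v_spec, se_spec, v_base, se_base)
    print(f"{name}: delta={v_spec - v_base:+.4f}, z={z:+.2f}")
```

Under that assumption, the flexible-extract gap of 0.0022 is about 0.16 combined standard errors and the strict-match scores are identical, so the tables show no measurable accuracy change from enabling speculative decoding.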

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

mergify · 2025-10-07T01:21:02Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @seven-mile.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: seven-mile <i@7li.moe>

seven-mile · 2025-10-09T14:24:42Z

I forgot to mark this PR as ready for review. Closing it, since a similar patch, #26485, has already started review.

mergify bot added the qwen Related to Qwen models label Oct 5, 2025

seven-mile changed the title ~~[SpecDecode] Support EAGLE for Qwen3 MoE~~ [Spec Decode] Support EAGLE for Qwen3 MoE Oct 5, 2025

mergify bot added the needs-rebase label Oct 7, 2025

[SpecDecode] Support EAGLE for Qwen3 MoE

cb27cef

Signed-off-by: seven-mile <i@7li.moe>

seven-mile force-pushed the add-qwen3moe-eagle3 branch from 58ed49c to cb27cef Compare October 7, 2025 01:25

mergify bot removed the needs-rebase label Oct 7, 2025

seven-mile closed this Oct 9, 2025
